Temporal Models and Smoothing
Data are often observed in time, and time dependence is often expected.
Note: We can use the same model to smooth covariate effects!
Smoothing of the time effect
Prediction
We can “predict” any unobserved data; it does not have to be in the future.
Time can be indexed over a
Discrete domain (e.g., years)
Main models: RW1, RW2 and AR1
Note: RW1 and RW2 are also used for smoothing covariates
Continuous domain
Goal: we want to understand the pattern and predict into the future.
Random walk models encourage the mean of the linear predictor to vary gradually over time.
They do this by assuming that, on average, the time effect at each point is the mean of the effect at the neighboring points.
Random walk of order 1 (RW1): we take the two nearest neighbors
Random walk of order 2 (RW2): we take the four nearest neighbors
Idea: \(\longrightarrow\ u_t = \text{mean}(u_{t-1} , u_{t+1}) + \text{Gaussian error with precision } \tau\)
Definition
\[ \pi(\mathbf{u} \mid \tau) \propto \exp\!\left( -\frac{\tau}{2} \sum_{t=1}^{T-1} (u_{t+1} - u_t)^2 \right) = \exp\!\left(-\tfrac{1}{2} \, \mathbf{u}^{\top} \mathbf{Q}\ \mathbf{u}\right) \] where the precision is \(\mathbf{Q} = \tau\mathbf{R}\) with
\[ \mathbf{R} = \begin{bmatrix} 1 & -1 & & & & \\ -1 & 2 & -1 & & & \\ & & \ddots & \ddots & \ddots & \\ & & & -1 & 2 & -1 \\ & & & & -1 & 1 \end{bmatrix} \]
\(\tau\) says how much \(u_t\) can vary around its mean
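The link between the sum of squared differences and the matrix \(\mathbf{R}\) can be checked numerically. The following is a minimal base-R sketch, not part of the original material: the series length, the vector `u` and the value `tau = 4` are arbitrary illustrative choices.

```r
# Build the RW1 structure matrix R for a short series (length 6 is arbitrary).
n_time <- 6
D <- diff(diag(n_time))   # (T-1) x T first-difference matrix: row t gives u[t+1] - u[t]
R <- crossprod(D)         # R = D'D: tridiagonal, diagonal 1, 2, ..., 2, 1, off-diagonal -1

# Check the identity sum((u[t+1] - u[t])^2) = u' R u on an arbitrary vector.
u <- c(0.2, 0.5, 0.4, 0.9, 1.3, 1.1)
quad_form   <- drop(t(u) %*% R %*% u)
sum_sq_diff <- sum(diff(u)^2)

# Unnormalized log-density for a hypothetical precision value.
tau <- 4
log_dens <- -0.5 * tau * quad_form
```

Writing \(\mathbf{R}\) as `crossprod(D)` makes the structure explicit: the prior only ever sees `u` through its first differences.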
We need to set a prior distribution for \(\tau\).
A common option is the so-called penalized complexity (PC) priors.
PC priors are readily available in inlabru for many model parameters.
They are built with two principles in mind:
A line is the base model
We want to penalize more complex models
\[ \text{Prob}\left(\sqrt{\frac{1}{\tau}}>U\right) = \text{Prob}(\sigma>U) = \alpha; \qquad U>0, \ \alpha \in (0,1) \]
Two common ways to choose \(U\) and \(\alpha\):
\(U\) is an upper limit for the standard deviation and \(\alpha\) is a small probability.
\(U\) is a likely value for the standard deviation and \(\alpha=0.5\).
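To see what a choice of \(U\) and \(\alpha\) implies, the sketch below assumes the standard form of the PC prior for a precision, in which \(\sigma = 1/\sqrt{\tau}\) follows an exponential distribution with rate \(\lambda = -\log(\alpha)/U\). That form is not stated in the text above, so treat it as an assumption of this example; the values \(U = 0.3\) and \(\alpha = 0.5\) are illustrative.

```r
# Assumed PC prior on a precision tau: sigma = 1/sqrt(tau) ~ Exponential(lambda),
# with lambda chosen so that P(sigma > U) = alpha.
U <- 0.3
alpha <- 0.5
lambda <- -log(alpha) / U

# Tail probability implied by this rate; it should reproduce alpha.
tail_prob <- pexp(U, rate = lambda, lower.tail = FALSE)

# Same check by simulation, going through the precision scale.
set.seed(1)
sigma_draws  <- rexp(1e5, rate = lambda)
tau_draws    <- 1 / sigma_draws^2
mc_tail_prob <- mean(sqrt(1 / tau_draws) > U)
```

With \(\alpha = 0.5\), \(U\) is the prior median of \(\sigma\), which matches the "likely value" reading of \(U\).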
The Model \[ \begin{aligned} y_i|\eta_i, \sigma^2 & \sim \mathcal{N}(\eta_i,\sigma^2)\\ \eta_i & = \beta_0 + f(t_i)\\ f(t_1),f(t_2),\dots,f(t_n) &\sim \text{RW2}(\tau) \end{aligned} \]
RW1 defines differences, not absolute levels:
Only the changes between neighboring terms are modeled.
The model has no information about the global mean (intercept).
Mathematically, \[ (u_1,\dots,u_n)\text{ and }(u_1+a,\dots,u_n+a) \] have the same density under the RW1 prior for any constant \(a\), so the model cannot tell them apart.
This means:
The precision matrix \(\mathbf{Q}\) is singular.
Posterior inference is not well-defined unless we fix the overall level.
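The rank deficiency can be verified directly. This is a small base-R sketch under illustrative assumptions (series length 6, an arbitrary vector `u`, an arbitrary shift `a`); it is not taken from the original material.

```r
n_time <- 6
R <- crossprod(diff(diag(n_time)))   # RW1 structure matrix

# A constant vector lies in the null space: R %*% 1 = 0.
null_check <- max(abs(R %*% rep(1, n_time)))

# One eigenvalue is numerically zero, so the rank is T - 1.
eig <- eigen(R, symmetric = TRUE)$values
rank_R <- sum(eig > 1e-10)

# Shifting u by a constant leaves the quadratic form unchanged.
u <- c(0.2, 0.5, 0.4, 0.9, 1.3, 1.1)
a <- 10
q_original <- drop(t(u) %*% R %*% u)
q_shifted  <- drop(t(u + a) %*% R %*% (u + a))

# Centering picks one representative from each family of shifted vectors.
u_centered <- u - mean(u)
```

Centering `u` is one way to fix the overall level: among all shifted copies of `u`, exactly one sums to zero.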
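Solution: add a sum-to-zero constraint, \(\sum_{t} u_t = 0\), so that the overall level is carried by the intercept.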
Just like RW1, but now we consider 4 neighbours instead of 2 \[ u_t = \text{weighted mean}(u_{t-2} ,u_{t-1} , u_{t+1}, u_{t+2} ) + \text{some Gaussian error with precision } \tau \]
RW2 models are smoother than RW1 models
The precision has the same role as for RW1
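As a sketch in the same notation as the RW1 definition, and assuming the usual second-difference form of the RW2 (this formula is not given explicitly in the text), the density can be written as
\[ \pi(\mathbf{u} \mid \tau) \propto \exp\!\left( -\frac{\tau}{2} \sum_{t=1}^{T-2} (u_t - 2u_{t+1} + u_{t+2})^2 \right) \]
The penalty is on second differences, so any straight line in \(t\) has zero penalty. Under this form the conditional mean of an interior point is a weighted combination of its four neighbours rather than a plain average:
\[ \text{E}(u_t \mid \mathbf{u}_{-t}) = \tfrac{4}{6}\,(u_{t-1} + u_{t+1}) - \tfrac{1}{6}\,(u_{t-2} + u_{t+2}) \]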
library(inlabru)  # also needs the INLA package installed

# RW1 smoother on year, with a PC prior such that P(sigma > 0.3) = 0.5
cmp1 = ~ Intercept(1) +
  time(year, model = "rw1",
       scale.model = TRUE,
       hyper = list(prec =
                      list(prior = "pc.prec",
                           param = c(0.3, 0.5))))

# Same model with an RW2 smoother
cmp2 = ~ Intercept(1) +
  time(year, model = "rw2",
       scale.model = TRUE,
       hyper = list(prec =
                      list(prior = "pc.prec",
                           param = c(0.3, 0.5))))

# Observation model for the Erie series (Gaussian is the default family)
lik = bru_obs(formula = Erie ~ .,
              data = lakes)

fit1 = bru(cmp1, lik)
fit2 = bru(cmp2, lik)

NOTE: the scale.model = TRUE option scales the \(\mathbf{Q}\) matrix so the precision parameter has the same interpretation in both models.
RW1 and RW2 are latent effects suitable for smoothing and modeling temporal data.
Each has one hyperparameter: the precision \(\tau\)
Both are intrinsic models
The precision matrix \(\mathbf{Q}\) is rank deficient
A sum-to-zero constraint is added to make the model identifiable!
RW2 models are smoother than RW1
Definition
\[ u_t = \phi u_{t-1} + \epsilon_t; \qquad \phi\in(-1,1), \ \epsilon_t\sim\mathcal{N}(0,\tau^{-1}) \] \[ \pi(\mathbf{u} \mid \tau, \phi)\propto\exp\left(-\tfrac{1}{2}\,\mathbf{u}^{\top}\mathbf{Q}\,\mathbf{u}\right) \] with \[ \mathbf{Q} = \tau \begin{bmatrix} 1 & -\phi & & & & \\ -\phi & (1+\phi^2) & -\phi & & & \\ & & \ddots & \ddots & \ddots & \\ & & & -\phi & (1+\phi^2) & -\phi \\ & & & & -\phi & 1 \end{bmatrix} \]
The AR1 model has two parameters: the precision \(\tau\) and the autocorrelation \(\phi\).
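The tridiagonal AR1 precision can be checked against the covariance of the process. The sketch below is in base R and assumes a stationary start, \(u_1 \sim \mathcal{N}\big(0, (\tau(1-\phi^2))^{-1}\big)\), which is what gives the corner entries their value of 1. The values of `n_time`, `phi` and `tau` are arbitrary, and `Q_struct` is a name introduced here for the matrix structure without the factor \(\tau\).

```r
n_time <- 5
phi <- 0.7
tau <- 2

# Tridiagonal structure: 1 at the corners, 1 + phi^2 inside, -phi off the diagonal.
Q_struct <- diag(c(1, rep(1 + phi^2, n_time - 2), 1))
off <- cbind(1:(n_time - 1), 2:n_time)   # index pairs (t, t + 1)
Q_struct[off] <- -phi                    # superdiagonal
Q_struct[off[, 2:1]] <- -phi             # subdiagonal
Q <- tau * Q_struct

# Covariance of a stationary AR1: Cov(u_s, u_t) = phi^|s - t| / (tau * (1 - phi^2)).
lags <- abs(outer(1:n_time, 1:n_time, "-"))
Sigma <- phi^lags / (tau * (1 - phi^2))

# If Q is the precision matrix, Q %*% Sigma is the identity.
max_abs_err <- max(abs(Q %*% Sigma - diag(n_time)))

# Unlike the RW1 structure matrix, this Q has full rank when |phi| < 1.
min_eig <- min(eigen(Q, symmetric = TRUE)$values)
```

The covariance matrix is dense while the precision matrix is tridiagonal, which is why these models are specified through \(\mathbf{Q}\).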